Don't Count, Predict! An Automatic Approach to Learning Sentiment Lexicons for Short Text

Authors

  • Duy-Tin Vo
  • Yue Zhang
Abstract

We describe an efficient neural network method to automatically learn sentiment lexicons without relying on any manual resources. The method takes inspiration from the NRC method, which gave the best results in SemEval13 by leveraging emoticons in large tweet collections, using the PMI between words and tweet sentiments to define the sentiment attributes of words. We show that better lexicons can be learned by using them to predict the tweet sentiment labels. By using a very simple neural network, our method is fast and can take advantage of the same data volume as the NRC method. Experiments show that our lexicons give significantly better accuracies on multiple languages compared to the current best methods.
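
The contrast between counting and predicting can be illustrated with a minimal sketch. It is not the authors' implementation: the toy tweets, the add-one smoothing, the function names `pmi_lexicon` and `predicted_lexicon`, and the single-scalar-per-word model trained with plain SGD are all assumptions made for illustration. The count-based score follows the usual PMI(w, positive) - PMI(w, negative) form, while the predictive variant treats each word's score as a trainable weight and fits it so that the summed scores of a tweet predict its emoticon-derived label.

```python
import math
import random
from collections import Counter

# Hypothetical toy data: (tokens, label); label 1 = positive emoticon, 0 = negative.
TWEETS = [
    ("love this phone".split(), 1),
    ("great day love it".split(), 1),
    ("great phone".split(), 1),
    ("hate this phone".split(), 0),
    ("awful day hate it".split(), 0),
    ("awful phone".split(), 0),
]


def pmi_lexicon(tweets, smoothing=1.0):
    """Count-based score: PMI(w, pos) - PMI(w, neg).

    Algebraically this reduces to
    log2(freq(w, pos) * freq(neg) / (freq(w, neg) * freq(pos))).
    The additive smoothing is an assumption to avoid division by zero.
    """
    pos, neg = Counter(), Counter()
    for tokens, label in tweets:
        (pos if label == 1 else neg).update(tokens)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    vocab = set(pos) | set(neg)
    return {
        w: math.log2(((pos[w] + smoothing) * n_neg) / ((neg[w] + smoothing) * n_pos))
        for w in vocab
    }


def predicted_lexicon(tweets, epochs=200, lr=0.5, seed=0):
    """Prediction-based score: one trainable scalar per word.

    A tweet's logit is the sum of its word scores; scores are updated by
    SGD on the logistic (cross-entropy) loss against the tweet label.
    """
    rng = random.Random(seed)
    score = {w: 0.0 for tokens, _ in tweets for w in tokens}
    data = list(tweets)
    for _ in range(epochs):
        rng.shuffle(data)
        for tokens, label in data:
            z = sum(score[w] for w in tokens)
            p = 1.0 / (1.0 + math.exp(-z))   # predicted P(positive)
            grad = p - label                 # d(loss)/d(logit)
            for w in tokens:
                score[w] -= lr * grad
    return score


if __name__ == "__main__":
    pmi = pmi_lexicon(TWEETS)
    pred = predicted_lexicon(TWEETS)
    for w in sorted(pmi):
        print(f"{w:6s} pmi={pmi[w]:+.2f} predicted={pred[w]:+.2f}")
```

In both variants the sign of a word's score gives its polarity and the magnitude its strength; the difference is only whether the score comes from co-occurrence counts or from optimizing a prediction objective.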

Related resources

On the Automatic Learning of Sentiment Lexicons

This paper describes a simple and principled approach to automatically construct sentiment lexicons using distant supervision. We induce the sentiment association scores for the lexicon items from a model trained on a weakly supervised corpus. Our empirical findings show that features extracted from such a machine-learned lexicon outperform models using manual or other automatically constructe...

Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media

Social media allows people to interact and express their thoughts or feelings about different subjects. However, some users may write offensive tweets to others via social media, which is known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "tweets" via one of the machine l...

Sentiment Lexicon-Based Features for Sentiment Analysis in Short Text

Sentiment lexicon-based features have proved their performance in recent work on sentiment analysis in Twitter. Automatically constructed lexicon features appear influential enough to attract attention. In this paper, we propose a new metric to estimate the word polarity score, called natural entropy (ne), in order to construct a new sentiment lexicon based on the Sentiment140 corpus. W...

MHSubLex: Using Metaheuristic Methods for Subjectivity Classification of Microblogs

In Web 2.0, people are free to share their experiences, views, and opinions. One of the problems that arises in Web 2.0 is the sentiment analysis of texts produced by users in outlets such as Twitter. One of the main tasks of sentiment analysis is subjectivity classification. Our aim is to classify the subjectivity of tweets. To this end, we create subjectivity lexicons in which the words into ...

Efficient Method Based on Combination of Deep Learning Models for Sentiment Analysis of Text

People's opinions about a specific concept are considered among the most important textual data available on the web. However, finding and monitoring web pages containing these comments and extracting valuable information from them is very difficult. In this regard, developing automatic sentiment analysis systems that can extract opinions and express their intellectual process has ...

Publication date: 2016